Consider the multiple testing problem of testing k null hypotheses, where
the unknown family of distributions is assumed to satisfy a certain
monotonicity assumption. Attention is restricted to procedures that control
the familywise error rate in the strong sense and which satisfy a
monotonicity condition. Under these assumptions, we prove certain maximin
optimality results for some well-known stepdown and stepup procedures.



1. Introduction. For classical single-stage multiple comparison procedures,
a number of optimality results are available. (See, e.g., [6], Chapter 11,
and [11], Chapter 7, particularly Sections 7.9 and 7.10.) However, no such
literature exists for the more recent stepdown and stepup methods. It is
the purpose of the present paper to establish optimality properties for
procedures of the latter kind.

Our setup and conditions are those of Lehmann [9], who discusses such an
optimality result for the testing of two hypotheses. For the general
problem of testing k null hypotheses $H_1, \dots, H_k$, consider k random
variables $X_1, \dots, X_k$; typically, these are test statistics for the
individual hypotheses $H_1, \dots, H_k$. We assume that
$X = (X_1, \dots, X_k)$ has some k-dimensional joint cumulative distribution
function $F_\theta(\cdot)$ indexed by $\theta = (\theta_1, \dots, \theta_k)$
in $\mathbb{R}^k$. The null hypothesis $H_i$ states $\theta_i \le 0$, which
is being tested against the alternatives $\theta_i > 0$.

Stepdown procedures were initiated by Holm [7], while the stepup approach
can be found in [2, 5, 8, 14, 16]. Background material on stepwise
procedures is provided by Hochberg and Tamhane [6] and Westfall and Young
[18]. Roughly speaking, stepdown procedures start by rejecting the most
significant hypothesis (corresponding to the largest $X_i$) and then they
sequentially consider the most significant of the remaining hypotheses.
Alternatively, stepup procedures start with the least significant hypothesis
(corresponding to the smallest $X_i$).

Our optimality results require crucial monotonicity assumptions and
restrictions. We say that a region $\mathcal{M}$ of x values is a monotone
increasing region if $x = (x_1, \dots, x_k) \in \mathcal{M}$ and
$x_i \le y_i$ for all i implies that $y = (y_1, \dots, y_k)$ is also in
$\mathcal{M}$. We assume of our model that increased values of $\theta$ lead
to higher values of X, specifically, that if $\theta_i \le \gamma_i$ for
all i, then

(1)  $\int_{\mathcal{M}} dF_\theta(x_1, \dots, x_k) \le \int_{\mathcal{M}} dF_\gamma(x_1, \dots, x_k)$

for every monotone increasing region $\mathcal{M}$. This assumption holds,
in particular, if the distributions $F_\theta$ have densities $p_\theta$
with (increasing) monotone likelihood ratio; that is, if
$x = (x_1, \dots, x_k)$, $y = (y_1, \dots, y_k)$,
$\theta = (\theta_1, \dots, \theta_k)$ and
$\theta' = (\theta_1', \dots, \theta_k')$, then

$\frac{p_{\theta'}(x)}{p_\theta(x)} \le \frac{p_{\theta'}(y)}{p_\theta(y)}$

whenever $x_i \le y_i$ for all i and $\theta_j \le \theta_j'$ for all j.
This notion of monotonicity was studied in [10]; other notions of
stochastic ordering are discussed in [12].

In addition to condition (1), we will assume an analogous monotonicity
property for the distribution of $(\delta_1 X_1, \dots, \delta_k X_k)$, for
any $\delta_i \in \{-1, 1\}$. Specifically, for every monotone increasing
region $\mathcal{M}$ and $\delta_i \theta_i \le \delta_i \gamma_i$,

(2)  $\int_{\mathcal{M}} dF_\theta(\delta_1 x_1, \dots, \delta_k x_k) \le \int_{\mathcal{M}} dF_\gamma(\delta_1 x_1, \dots, \delta_k x_k)$.

For example, the condition for $(-X_1, \dots, -X_k)$ means that for any
monotone decreasing region $\mathcal{M}'$ (the complement of a monotone
increasing region), the inequality (1) is reversed; that is, the
probability of the event $\{X \in \mathcal{M}'\}$ increases as $\theta$
decreases (in each component).

Under these assumptions, we shall restrict attention to decision rules
satisfying the following monotonicity condition. A decision rule D based on
X states for each possible value x of X the subset $I = I_x$ of
$\{1, \dots, k\}$ of values i for which the hypothesis $H_i$ is rejected. A
decision rule D is said to be monotone if

$x_i \le y_i$ for $i \in I_x$ but $y_i < x_i$ for $i \notin I_x$

implies that $I_x = I_y$. Thus, the subset of x values that results in
rejecting all hypotheses is a monotone increasing region. More generally,
fix $I \subset \{1, \dots, k\}$ and, based on a monotone decision rule, let
$M_I$ denote the set of x values such that $I_x = I$. If $\delta_i = 1$ for
$i \in I$ and $\delta_i = -1$ otherwise, then

$\{(\delta_1 x_1, \dots, \delta_k x_k) : (x_1, \dots, x_k) \in M_I\}$

is a monotone increasing set. By assumption (2), the probability of this
set is increasing in $\delta_i \theta_i$.

Among all monotone decision rules that provide strong control of the 
familywise error rate (FWER), that is, of the probability of committing a 
Type 1 error by wrongly rejecting one or more true hypotheses, under any 
configuration of true and false null hypotheses we shall show how to maxi- 
mize certain aspects of the power of the procedures (i.e., of the probability 
of correctly rejecting false hypotheses). However, we note that we are not 
restricting attention to any kind of stepwise procedure; rather, the resulting 
optimal procedures take the form of well-known stepwise procedures, which 
will be fully described later. 

Here the restriction to monotone procedures is not just for convenience — 
the results are not true without this restriction. It is, in fact, possible to im- 
prove the rejection probability without violating the error control by adding 
small implausible pieces to the rejection regions, resulting in decision rules 
that are very counterintuitive. That this is possible is due to the fact that 
the bound for the error control is not attained but only approached in the 
limit as some parameter values tend to $\infty$ or $-\infty$. For a discussion of the
pros and cons of such counterintuitive decision rules with references to the 
literature, see [13]. 

To conclude this introduction, we mention some situations in which the
present approach does and some in which it does not apply. As a first
example, consider a paired comparison experiment with pairs of observations
$(Y_i, Z_i)$. Let $E(Y_i) = \mu_i$ and $E(Z_i) = \nu_i$, and consider
testing the hypotheses $\theta_i = \nu_i - \mu_i = 0$ against the
alternatives $\theta_i > 0$. If we put $X_i = Z_i - Y_i$ and base our
inferences on the X's, this reduces to the situation considered here. This
example can be extended to the comparison of two treatments with $m_i$ and
$n_i$ observations ($i = 1, \dots, k$), respectively, on k subjects.
Another application is the comparison of k treatments with a control. Here
$\theta_i = \mu_i - \mu_0$, where the $\mu_i$ ($i = 1, \dots, k$) and
$\mu_0$ are the means for the k treatments and the control, respectively.

On the other hand, the approach does not apply to the comparison of k
treatments, that is, the hypothesis $H : \mu_1 = \cdots = \mu_k$, where in
the case of rejection one wishes to determine the pairs $i < j$ for which
$\mu_i < \mu_j$. As in the preceding examples, the hypothesis can be
reduced to $H : \theta_2 = \cdots = \theta_k = 0$ with, for example,
$\theta_i = \mu_i - \mu_1$. However, with the resulting procedure, one can
only determine the significant differences $\mu_j - \mu_i$ with $i = 1$ and
not those with $1 < i < j$.

In Section 2 we treat the case k = 2 separately. In Section 3 we con- 
sider general k for stepdown procedures, but make a further exchangeability 
assumption. The corresponding results for stepup procedures are then pro- 
vided in Section 4, though a further assumption of monotonicity of critical 
values is invoked. Section 5 is a brief conclusion and all proofs are deferred 
to Section 6. 
Distributional assumptions. We suppose $(X_1, \dots, X_k)$ has a joint
cumulative distribution function $F_\theta(\cdot)$, indexed by
$\theta = (\theta_1, \dots, \theta_k)$ in $\mathbb{R}^k$. The parameter
space is a finite or infinite open rectangle with

$\theta_i^L < \theta_i < \theta_i^U, \quad i = 1, \dots, k$.

Similarly, the sample space is assumed to be a finite or infinite open
rectangle with

$x_i^L < x_i < x_i^U, \quad i = 1, \dots, k$,

independent of $\theta$. We further assume the distribution of any
subcollection $\{X_i, i \in I\}$ depends only on those $\theta_i$ with
$i \in I$, and that $X_i$ tends in probability to $x_i^L$ as
$\theta_i \to \theta_i^L$ and $X_i$ tends in probability to $x_i^U$ as
$\theta_i \to \theta_i^U$.

To ease the notation, we assume here and in the remainder of the paper that
$\theta_i$ varies in all of $\mathbb{R}$, so that $\theta_i^L = -\infty$
and $\theta_i^U = \infty$. We also simplify the notation by taking
$x_i^L = -\infty$ and $x_i^U = \infty$. In addition, we assume that the
joint distribution of X has a density with respect to Lebesgue measure;
this is used only so that the critical constants of the optimal procedures
can be chosen to achieve control at a given level $\alpha$ exactly, but
this hypothesis can certainly be weakened. In order for the critical
constants to be uniquely defined, we further assume that the joint density
is positive on its (assumed rectangular) region of support, but this can be
weakened as well.

2. The case k = 2. We are testing hypotheses $H_1$ and $H_2$ with $H_i$
corresponding to $\theta_i \le 0$. Let $\omega_{0,0}$ denote the part of
the parameter space where both $H_1$ and $H_2$ are true; let $\omega_{0,1}$
correspond to the part where $H_1$ is true and $H_2$ is not true; similarly
for $\omega_{1,0}$ and $\omega_{1,1}$.

A decision rule D analogously divides up the sample space into regions
$d_{0,0}$, $d_{0,1}$, $d_{1,0}$ and $d_{1,1}$. For example, $d_{0,1}$
corresponds to the region in the sample space where $H_1$ is declared true
and $H_2$ is declared false. Also, let $d_i$ be the region where $H_i$ is
rejected, so $d_1 = d_{1,0} \cup d_{1,1}$ and $d_2 = d_{0,1} \cup d_{1,1}$.

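We will restrict attention to rules D that are

(3)  monotone

and such that the

(4)  FWER $\le \alpha$.

For $\epsilon = (\epsilon_1, \epsilon_2)$ with $\epsilon_i > 0$, consider
subsets of elements $(\theta_1, \theta_2)$ defined by

(5)  $A_1(\epsilon) = \{\theta_1 \ge \epsilon_1\} \cup \{\theta_2 \ge \epsilon_2\}$

and

(6)  $A_2(\epsilon) = \{\theta_1 \ge \epsilon_1\} \cap \{\theta_2 \ge \epsilon_2\}$.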
A decision rule is deemed good if the quantities

(7)  $\inf_{\theta \in A_1(\epsilon)} P_\theta\{d_{0,0}^c\}$

and

(8)  $\inf_{\theta \in A_2(\epsilon)} P_\theta\{d_{1,1}\}$

are large. As we will see, it is not possible to find a rule D satisfying
(3) and (4) that maximizes (7) and (8) simultaneously.

In order to appreciate the criteria (7) and (8), first suppose
$\theta \in A_1(\epsilon)$. Then at least one $\theta_i$ is positive and so
we would not want to conclude that both $\theta_i$ are $\le 0$; rather, we
wish to conclude $d_{0,0}^c$. Thus, maximizing (7) maximizes the minimum
probability that we do not conclude $d_{0,0}$ as $\theta$ varies in
$A_1(\epsilon)$. Similarly, if $\theta \in A_2(\epsilon)$, then both
$\theta_i$ are positive, and so we wish to maximize the (minimum) chance
that we make the decision $d_{1,1}$.

In addition, we also consider the following notion of optimality. Again
suppose $\theta \in A_1(\epsilon)$, so that at least one $\theta_i$ is
positive. Then, as above, we do not want to make the decision $d_{0,0}$.
However, we also do not wish to make the decision $d_{0,1}$ if, in fact,
$H_1$ is false and $H_2$ is true; we would rather make the correct decision
$d_{1,0}$. So, we also consider the problem of maximizing

(9)  $\inf_{\theta \in A_1(\epsilon)} P_\theta\{\text{reject at least one false hypothesis}\}$.

In other words, the criterion (7) maximizes the minimum probability of
rejecting at least one hypothesis (regardless of which are true and false),
while criterion (9) maximizes the minimum probability of rejecting at least
one false hypothesis. The latter criterion seems more compelling, though
the former criterion might be justified in a situation where it is
important to know that the joint null hypothesis (i.e., the global
hypothesis that both hypotheses are true) is not true. In any case, we
shall see that the same optimal procedure D arises from both criteria.

Theorem 2.1. Consider the case k = 2 under the assumptions given in
Section 1.

(i) A rule D satisfying (3) and (4) maximizes (7) if

(10)  $d_{0,0} = \{X_1 < a_1, X_2 < a_2\}$,

where the constants $a_1$ and $a_2$ are determined by

(11)  $P_{0,0}\{d_{0,0}^c\} = \alpha$

and

(12)  $P_{\epsilon_1}\{X_1 > a_1\} = P_{\epsilon_2}\{X_2 > a_2\}$.

Its minimum (rejection) probability over $A_1(\epsilon)$ is given by
$P_{\epsilon_1}\{X_1 > a_1\}$.

(ii) The (stepdown) rule D satisfying (3), (4) and (10) that maximizes (8)
is given by

(13)  $d_{0,1} = \{X_1 < b_1, X_2 > a_2\}$,

(14)  $d_{1,0} = \{X_1 > a_1, X_2 < b_2\}$,

(15)  $d_{1,1} = \{X_1 > b_1, X_2 > b_2\} \cap d_{0,0}^c$,

where $b_i$ satisfies

(16)  $P_0\{X_i > b_i\} = \alpha$

(and so $b_i < a_i$). The minimum probability of $d_{1,1}$ over
$A_2(\epsilon)$ is
$P_{\epsilon_1,\epsilon_2}\{\{X_1 > a_1, X_2 > b_2\} \cup \{X_1 > b_1, X_2 > a_2\}\}$.

(iii) The result (i) holds for D if criterion (7) is replaced by (9), and
(12) is also the maximum value of criterion (9).

Note that once $d_{0,0}$ and $d_{1,1}$ are determined, so are $d_{0,1}$ and
$d_{1,0}$ (by monotonicity).

The procedure D of Theorem 2.1 is an example of a stepdown procedure. It
starts by rejecting the most significant hypothesis (corresponding to the
largest $X_i$) and it then sequentially considers the most significant of
the remaining hypotheses. Alternatively, stepup procedures start with the
least significant hypothesis (corresponding to the smallest $X_i$), and an
optimality result is now given for such a procedure.

Remark 2.1. The proof shows that the optimal procedure D in (i) and (ii) is
the unique rule satisfying (3) and (4) which maximizes (7), in the sense
that if E is any other such rule, then $e_{0,0} \,\Delta\, d_{0,0}$ has
Lebesgue measure 0, where $A \,\Delta\, B$ denotes the symmetric difference
between sets A and B. Similarly, a rule E satisfying (3), (4) and (10)
maximizing (8) must satisfy $e_{1,1} \,\Delta\, d_{1,1}$ has Lebesgue
measure 0.

Also, notice that the optimal procedure D does not depend on $\epsilon$. It
follows that D is admissible in the following sense. Suppose there exists
another monotone rule E that controls the FWER, and such that

(17)  $P_\theta\{d_{0,0}^c\} \le P_\theta\{e_{0,0}^c\}$ for all $\theta \in \omega_{0,0}^c$,

with strict inequality for some $\theta \in \omega_{0,0}^c$. Taking the
infimum of both sides over $\theta \in A_1(\epsilon)$, it follows that E
must also be optimal in the sense of Theorem 2.1(i). But, by uniqueness,
$e_{0,0} \,\Delta\, d_{0,0}$ has Lebesgue measure 0, which implies the
$\le$ in (17) is an equality. A similar admissibility result for the region
$d_{1,1}$ can be stated as well.

Analogous uniqueness and admissibility results hold for all the optimal 
procedures presented later on. For a discussion of admissibility in multiple 
testing problems, see [3]. 
Theorem 2.2. Consider the case k = 2 under the assumptions given in
Section 1.

(i) A rule D satisfying (3) and (4) that maximizes (8) is given by

(18)  $d_{1,1} = \{X_1 > b_1, X_2 > b_2\}$,

and $d_i \subset \{X_i > b_i\}$, where $b_i$ satisfies (16) (so it is the
same constant as in Theorem 2.1). Its minimum probability over
$A_2(\epsilon)$ is given by
$P_{\epsilon_1,\epsilon_2}\{X_1 > b_1, X_2 > b_2\}$.

(ii) The (stepup) rule D satisfying (3), (4) and (18) that maximizes (7) is
given by

(19)  $d_{0,1} = \{X_1 < b_1, X_2 > d_2\}$,

(20)  $d_{1,0} = \{X_1 > d_1, X_2 < b_2\}$,

(21)  $d_{0,0} = \{X_1 < d_1, X_2 < d_2\} \cap d_{1,1}^c$,

where $d_i$ is determined so that

(22)  $P_{0,0}\{d_{0,0}^c\} = \alpha$

and

(23)  $P_{\epsilon_1}\{X_1 > d_1\} = P_{\epsilon_2}\{X_2 > d_2\}$.

The value of (23) is the minimum probability of D over $A_1(\epsilon)$.

(iii) The result (ii) holds for D if criterion (7) is replaced by (9), and
(23) is also the maximum value of criterion (9).

Remark 2.2. Note that $b_i < a_i < d_i$. Also, the best minimum probability
over $A_1(\epsilon)$ in the case of Theorem 2.1 exceeds the best in the
case of Theorem 2.2, but the comparison reverses for $A_2(\epsilon)$.
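
The ordering of the three constants can be checked numerically in a special
case. The sketch below is only an illustration under assumptions that are
ours and not part of the theorems: $X_1$ and $X_2$ are independent, $X_i$
is normal with mean $\theta_i$ and variance 1, and
$\epsilon_1 = \epsilon_2$, so that the constants do not depend on i. Then
the level condition for the stepdown rule reads $\Phi(a)^2 = 1 - \alpha$,
(16) gives $b = z_{1-\alpha}$, and (22) reduces to
$2\Phi(d)\Phi(b) - \Phi(b)^2 = 1 - \alpha$, that is,
$\Phi(d) = 1 - \alpha/2$. The function name `k2_constants` is hypothetical.

```python
from statistics import NormalDist


def k2_constants(alpha):
    """Return (b, a, d) for k = 2 under the ASSUMED model of independent
    unit-variance normal statistics with equal epsilon_1 = epsilon_2."""
    z = NormalDist().inv_cdf           # standard normal quantile function
    b = z(1 - alpha)                   # P_0{X_i > b} = alpha
    a = z((1 - alpha) ** 0.5)          # P_{0,0}{X_1 < a, X_2 < a} = 1 - alpha
    d = z(1 - alpha / 2)               # 2*Phi(d)*Phi(b) - Phi(b)^2 = 1 - alpha
    return b, a, d


if __name__ == "__main__":
    b, a, d = k2_constants(0.05)
    print(f"b = {b:.4f}, a = {a:.4f}, d = {d:.4f}")
```

In this special case the printed values satisfy $b < a < d$, in line with
the first sentence of the remark.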

Remark 2.3. A remark analogous to Remark 2.1 applies to the optimal
procedure in Theorem 2.2.

Remark 2.4. It is now clear that, subject to (3) and (4), we cannot find a
rule to maximize both (7) and (8). By Theorem 2.1(i) and Theorem 2.2(i),
such a rule D would have to satisfy

$d_{0,0} = \{X_1 < a_1 \text{ and } X_2 < a_2\}$

and

$d_{1,1} = \{X_1 > b_1, X_2 > b_2\}$

simultaneously, which is impossible because these two sets have a
nontrivial intersection as $b_i < a_i$.
Remark 2.5. The results of this paper do not hold without the monotonicity
assumption. For example, consider part (i) of Theorem 2.2. Suppose further
that $X_1$ and $X_2$ are independent with $X_i$ normally distributed with
mean $\theta_i$ and variance 1. Then $b_i = b = z_{1-\alpha}$, the
$1 - \alpha$ quantile of the standard normal distribution. The probability
of $d_{1,1}$ under any $(\theta_1, \theta_2)$ with at least one
$\theta_i \le 0$ is always less than $\alpha$, and it approaches $\alpha$
only in the limit as the other parameter tends to $\infty$ (with
$\theta_i = 0$). Therefore, by adding to $d_{1,1}$ a small enough region in
the southwest quadrant, one can increase the rejection probability without
violating the level constraint; see Section 4 of [13]. Such a procedure is
not monotone. Similarly, regarding the problem addressed in (i) of Theorem
2.1, [9], Section 3, shows that the maximin test is not monotone.
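
The slack in the level constraint described in Remark 2.5 is easy to see
numerically. The following sketch assumes exactly the model of the remark
(independent normal statistics with unit variance) and evaluates
$P_{\theta_1,\theta_2}\{X_1 > b, X_2 > b\}$ with $b = z_{1-\alpha}$; the
function name `prob_reject_both` is hypothetical.

```python
from statistics import NormalDist


def prob_reject_both(theta1, theta2, alpha=0.05):
    """P{X_1 > b, X_2 > b} for independent X_i ~ N(theta_i, 1), b = z_{1-alpha}."""
    std = NormalDist()
    b = std.inv_cdf(1 - alpha)
    # Independence lets the joint probability factor into two tail probabilities.
    return (1 - std.cdf(b - theta1)) * (1 - std.cdf(b - theta2))


if __name__ == "__main__":
    # H_1 is true (theta_1 = 0); let theta_2 grow.
    for theta2 in (0.0, 1.0, 2.0, 4.0, 8.0):
        print(theta2, prob_reject_both(0.0, theta2))
```

With $\theta_1 = 0$ fixed, the printed probabilities increase with
$\theta_2$ but stay below $\alpha$, which is the room a nonmonotone rule
can exploit.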

3. General k stepdown. Consider testing k null hypotheses
$H_1, \dots, H_k$ with $H_i$ corresponding to $\theta_i \le 0$. In this
section and the next, we add a symmetry condition for the joint
distribution of $(X_1, \dots, X_k)$. Specifically, we assume that the joint
distribution of $(X_1, \dots, X_k)$ under $\theta_i = \theta$ (some value
independent of i) is exchangeable. This is not a crucial assumption (and
actually only needs to hold at $\theta = 0$ or $\theta = \epsilon$, where
$\epsilon$ is given in the statement of the theorems), but it reduces the
number of critical values from order $2^k$ to k. The results should
generalize, but at the expense of more complicated notation.

Let

$X_{r_1} \ge X_{r_2} \ge \cdots \ge X_{r_k}$

denote the ordered X-values, and let $H_{r_1}, \dots, H_{r_k}$ denote their
corresponding null hypotheses.

For any (monotone) decision rule E, let $E_{k,j}$ denote the event that E
rejects at least j of the null hypotheses. For $\epsilon > 0$, let

$A_j(\epsilon) = \{(\theta_1, \dots, \theta_k) : \text{at least } j\ \theta_i \text{ satisfy } \theta_i \ge \epsilon\}$.

Consider the monotone stepdown decision rule D that rejects
$H_{r_1}, \dots, H_{r_j}$ and accepts the remaining null hypotheses if
$X_{r_i} > c_{k,i}$ for $1 \le i \le j$, but
$X_{r_{j+1}} \le c_{k,j+1}$, where the $c_{k,j} = c_{k,j}(\alpha)$ are
determined by

(24)  $P_{\underbrace{0,\dots,0}_{k-j+1\ \text{times}}}\{X_i > c_{k,j} \text{ for some } i,\ 1 \le i \le k-j+1\} = \alpha$.

Then

$D_{k,j} = \{X_{r_i} > c_{k,i},\ 1 \le i \le j\}$.

Note that (24) implies the important relationship

(25)  $c_{k,j} = c_{k-1,j-1}$

if $k \ge j \ge 2$. Also note that, for fixed k, $c_{k,j}$ is nonincreasing
in j.

Since the constants $c_{k,j}$ depend only on $k - j$, we may more
succinctly define

(26)  $f_{k-i+1} = c_{k,i}$,

where the $f_j$ are determined by

(27)  $P_{\underbrace{0,\dots,0}_{j\ \text{times}}}\{\max(X_1, \dots, X_j) > f_j\} = \alpha$.

The procedure D then rejects $H_{r_1}, \dots, H_{r_j}$ if and only if
$X_{r_i} > f_{k-i+1}$ for $1 \le i \le j$.
Lemma 3.1. Suppose the assumptions of Section 1 and the symmetry condition
described at the beginning of this section hold.

(i) The above decision rule D controls the FWER at level $\alpha$.

(ii) Define

$\beta_{k,j}(\alpha, \epsilon) = \inf_{\theta \in A_j(\epsilon)} P_\theta\{D_{k,j}\}$;

that is, $\beta_{k,j}(\alpha, \epsilon)$ is the minimum probability of
$D_{k,j}$ over $A_j(\epsilon)$. Then

(28)  $\beta_{k,j}(\alpha, \epsilon) = P_{\underbrace{\epsilon,\dots,\epsilon}_{j\ \text{times}}}\{S_{k,j}\}$,

where

(29)  $S_{k,j} = \{X_{\pi_j(1)} > f_k, \dots, X_{\pi_j(j)} > f_{k-j+1} \text{ for some permutation } \pi_j \text{ of } \{1, \dots, j\}\}$.

So (28) is the minimum probability over $A_j(\epsilon)$ not only of
rejecting at least j hypotheses, but also of rejecting at least j false
hypotheses.

Theorem 3.1. Suppose the assumptions of Section 1 and the symmetry
condition described at the beginning of this section hold.

(i) Among monotone decision rules E that control the FWER, D maximizes

(30)  $\inf_{\theta \in A_1(\epsilon)} P_\theta\{E_{k,1}\}$.

Also, D maximizes

$\inf_{\theta \in A_2(\epsilon)} P_\theta\{E_{k,2}\}$

among such rules E that also satisfy $E_{k,2} \subset D_{k,1}$. In general,
for $j = 2, \dots, k$, D maximizes

(31)  $\inf_{\theta \in A_j(\epsilon)} P_\theta\{E_{k,j}\}$

among monotone rules E that control the FWER and satisfy

(32)  $E_{k,j} \subset D_{k,j-1}$.

Therefore, for any other rule E, we must have

$\inf_{\theta \in A_j(\epsilon)} P_\theta\{E_{k,j}\} < \beta_{k,j}(\alpha, \epsilon)$

for at least one j.

(ii) D also is optimal in the sense that it maximizes

$\inf_{\theta \in A_j(\epsilon)} P_\theta\{\text{reject at least } j \text{ false hypotheses}\}$

subject to (32).

Remark 3.1. The procedure D is essentially unique (up to sets of 
Lebesgue measure 0), as described in Remark 2.1, and an admissibility result 
analogous to that described in Remark 2.1 holds as well. 

Remark 3.2. For fixed k, the optimal constants with $c_{k,j} = f_{k-j+1}$
are given by the values

(33)  $c_{k,1}, c_{k,2}, \dots, c_{k,k}$.

But, since $c_{k,2} = c_{k-1,1}$, $c_{k,3} = c_{k-2,1}$, and so on, the
sequence (33) is equivalent to

$c_{k,1}, c_{k-1,1}, c_{k-2,1}, \dots, c_{1,1}$.

This is just a sequentially rejective procedure of the kind proposed by
Holm [7]: after the first step using the critical value $c_{k,1}$, reduce
the number of hypotheses from k to $k - 1$ and repeat the first step but
now with $c_{k-1,1}$, and so on. In the case where the $X_i$ have a uniform
(0, 1) marginal distribution under the null hypothesis (so that we
translate everything into p-values and reject for small values), Holm [7]
used $c_{k,1} = \alpha/k$ since he assumed only the marginal distributions
to be known (and strong error control follows by Bonferroni). Our $c_{k,1}$
would then be determined by

$P_{\underbrace{0,\dots,0}_{k\ \text{times}}}\{X_i \le c_{k,1} \text{ for one or more values of } i : 1 \le i \le k\} = \alpha$

or, equivalently,

$P_{0,\dots,0}\{\min(X_1, \dots, X_k) \le c_{k,1}\} = \alpha$.

If we further assume independence of the p-values, then the critical
constants $c_{k,j}$ satisfy

$1 - (1 - c_{k,j})^{k-j+1} = \alpha$.

Thus, the Holm principle remains in effect, except that instead of using
$c_{k,1} = \alpha/k$, the independence assumption implies the exact
critical value $c_{k,1} = 1 - (1 - \alpha)^{1/k}$.
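
The comparison in Remark 3.2 can be tabulated directly. The sketch below
lists, for each step j, the Bonferroni-type constant $\alpha/(k-j+1)$
obtained by applying $\alpha/k$ to the $k-j+1$ hypotheses remaining at that
step, next to the constant solving
$1 - (1 - c_{k,j})^{k-j+1} = \alpha$ under independence. Both function
names are hypothetical, and the p-value scale (reject for small values) is
the one used in the remark.

```python
def holm_constants(k, alpha=0.05):
    """Bonferroni-type constants alpha / (k - j + 1) for steps j = 1, ..., k."""
    return [alpha / (k - j + 1) for j in range(1, k + 1)]


def exact_independent_constants(k, alpha=0.05):
    """Constants solving 1 - (1 - c)**(k - j + 1) = alpha for j = 1, ..., k,
    valid under the ASSUMPTION of independent uniform p-values under the null."""
    return [1 - (1 - alpha) ** (1 / (k - j + 1)) for j in range(1, k + 1)]


if __name__ == "__main__":
    k = 5
    for j, (h, e) in enumerate(
        zip(holm_constants(k), exact_independent_constants(k)), start=1
    ):
        print(f"step {j}: Bonferroni-type {h:.5f}   independence {e:.5f}")
```

On the p-value scale a larger constant means a less conservative step, so
the independence-based constants are never smaller than the Bonferroni-type
ones and the two coincide at the last step.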

4. General k stepup. Assume the conditions imposed in the previous section.
We are testing null hypotheses $H_1, \dots, H_k$ with $H_i$ corresponding
to $\theta_i \le 0$. Let

$X_{(1)} \le X_{(2)} \le \cdots \le X_{(k)}$

denote the ordered X-values; in the notation of the previous section,
$X_{(i)} = X_{r_{k-i+1}}$.

Consider the following monotone stepup decision rule D for appropriately
chosen constants $d_1, \dots, d_k$ (to be specified shortly, but assumed
nondecreasing). If $X_{(1)} > d_1$, then reject all null hypotheses.
Otherwise, if $X_{(1)} < d_1$ but $X_{(2)} > d_2$, reject the $k - 1$
hypotheses corresponding to the $k - 1$ largest X's. In general, for the
smallest j such that $X_{(j)} > d_j$, reject the $k - j + 1$ hypotheses
corresponding to the $k - j + 1$ largest X's and accept the remaining.
(Note that the constants $d_j$ should perhaps be written as $d_{k,j}$ to
show the dependence on k; however, we will see that $d_{k,j}$ will be
chosen to be independent of k and so we just abbreviate to $d_j$.)
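
A minimal sketch of the stepup rule follows. The constants are passed in by
the caller; the values used in the usage example are an assumption of ours
for illustration: for k = 2 with independent standard normal statistics
under the null, working out the level condition by hand gives
$d_1 = z_{1-\alpha}$ and $d_2 = z_{1-\alpha/2}$. The function name `stepup`
is hypothetical.

```python
from statistics import NormalDist


def stepup(x, d):
    """Return the sorted indices of rejected hypotheses.

    x : list of observed statistics X_1, ..., X_k
    d : list with d[j - 1] = d_j, assumed nondecreasing in j
    """
    k = len(x)
    # Indices ordered from the smallest statistic to the largest.
    order = sorted(range(k), key=lambda i: x[i])
    for j in range(1, k + 1):
        # First j with X_(j) > d_j: reject the k - j + 1 largest statistics.
        if x[order[j - 1]] > d[j - 1]:
            return sorted(order[j - 1:])
    return []  # no ordered statistic exceeded its constant


if __name__ == "__main__":
    z = NormalDist().inv_cdf
    alpha = 0.05
    d = [z(1 - alpha), z(1 - alpha / 2)]  # ASSUMED constants for k = 2
    print(stepup([1.7, 1.8], d))
    print(stepup([1.0, 2.0], d))
```

Because the scan starts from the least significant statistic, a single
moderately large smallest value is enough to reject every hypothesis, which
is the characteristic difference from the stepdown rule.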

The above rule rejects at least j null hypotheses for the set $D_{k,j}$
defined by

$D_{k,j} = \{X_{(1)} > d_1\} \cup \cdots \cup \{X_{(k-j+1)} > d_{k-j+1}\}$.

Equivalently, at least $k - j + 1$ hypotheses are accepted if $D_{k,j}^c$
occurs, where

$D_{k,j}^c = \{X_{(1)} < d_1\} \cap \cdots \cap \{X_{(k-j+1)} < d_{k-j+1}\}$.

Evidently,

$D_{k,j+1} \subset D_{k,j}$.

The constants $d_j$ are determined so that

(34)  $P_{\underbrace{0,\dots,0}_{j\ \text{times}}}\{L_j\} = 1 - \alpha$,

where

(35)  $L_j = \{X_{\pi(1)} < d_1, \dots, X_{\pi(j)} < d_j \text{ for some permutation } \pi \text{ of } \{1, \dots, j\}\}$.

Note that the constant $d_j$ does not depend on k as reflected in the
notation. Also, $d_1 = c_{k,k} = c_{1,1} = f_1$, where $c_{1,1}$ and $f_1$
are the constants (24) and (26) of the previous section. However, as
pointed out by an anonymous referee, in the case $k > 2$, it need not be
the case that $d_j$ is nondecreasing in j. A counterexample is provided in
[4]; some further references on the monotonicity of critical values are [1]
and [15]. In order to prove our results, we need to assume the monotonicity
holds.

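Lemma 4.1. Assume the conditions of Lemma 3.1. Also assume that the
constants $d_j$ used in the procedure D are nondecreasing in j.

(i) The above decision rule D controls the FWER at level $\alpha$.

(ii) Define

$\beta_{k,j}(\alpha, \epsilon) = \inf_{\theta \in A_j(\epsilon)} P_\theta\{D_{k,j}\}$;

that is, $\beta_{k,j}(\alpha, \epsilon)$ is the minimum probability of
$D_{k,j}$ over $A_j(\epsilon)$. Then

(36)  $\beta_{k,j}(\alpha, \epsilon) = P_{\underbrace{\epsilon,\dots,\epsilon}_{j\ \text{times}}}\{\min(X_1, \dots, X_j) > d_{k-j+1}\}$.

The minimum probability over $A_j(\epsilon)$ of rejecting at least j false
hypotheses is also given by (36).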

Theorem 4.1. Assume the conditions of Theorem 3.1. Also assume that the
constants $d_j$ used in the procedure D are nondecreasing in j.

(i) Among monotone decision rules E that control the FWER at level
$\alpha$, D maximizes

(37)  $\inf_{\theta \in A_k(\epsilon)} P_\theta\{E_{k,k}\}$.

Also, D maximizes

$\inf_{\theta \in A_{k-1}(\epsilon)} P_\theta\{E_{k,k-1}\}$

among rules that satisfy $D_{k,k} \subset E_{k,k-1}$. In general, for
$j = k - 1, \dots, 1$, D maximizes

(38)  $\inf_{\theta \in A_j(\epsilon)} P_\theta\{E_{k,j}\}$

among monotone rules E that control the FWER and satisfy

(39)  $D_{k,j+1} \subset E_{k,j}$.

Therefore, for any other rule E, we must have

$\inf_{\theta \in A_j(\epsilon)} P_\theta\{E_{k,j}\} < \beta_{k,j}(\alpha, \epsilon)$

for at least one j.

(ii) D also is optimal in the sense that it maximizes

$\inf_{\theta \in A_j(\epsilon)} P_\theta\{\text{reject at least } j \text{ false hypotheses}\}$

subject to (39).

Remark 4.1. Again, the procedure D is unique up to sets of Lebesgue 
measure 0, and it is admissible; see Remark 2.1. 

Remark 4.2. Letting

$X_{j:1} \le \cdots \le X_{j:j}$

denote the ordered values of just the first j X's, the constants $d_j$ are
determined by

$P_{\underbrace{0,\dots,0}_{j\ \text{times}}}\{X_{j:1} < d_1, \dots, X_{j:j} < d_j\} = 1 - \alpha$.

If we compare this with (27), we see that $f_j < d_j$, except when $j = 1$,
in which case $f_1 = d_1$.

5. Conclusions. Stepdown and stepup methods were proposed as intu- 
itively appealing by Holm, Hochberg, Dunnett and Tamhane, and others. 
The present paper, treating the case of one-sided alternatives only, used 
optimality criteria that seemed reasonable and were not selected to justify
predetermined solutions. It is gratifying that the results confirm the intu- 
ition of the originators of these methods. Even though our assumptions are 
strong, some stepwise methods can now be viewed as asymptotically optimal, 
such as the stepup method of Dunnett and Tamhane [2]. Outside the strong 
assumptions imposed in this paper, Westfall and Young [18] give general 
resampling methods to approximate the critical values of stepdown proce- 
dures, while Troendle [17] addresses the corresponding problem for stepup 
procedures. 

6. Proofs and auxiliary results. 

Proof of Theorem 2.1. First, observe that for the procedure D given in (i),
for $\theta \in \omega_{0,0}$,

$P_\theta\{d_{0,0}^c\} \le P_{0,0}\{d_{0,0}^c\} = P_{0,0}\{X_1 > a_1 \text{ or } X_2 > a_2\} = \alpha$

by choice of $a_i$. For this D, by monotonicity, the inf over
$\theta \in A_1(\epsilon)$ in (7) occurs at
$(\theta_1, \theta_2) = (\epsilon_1, -\infty)$ or $(-\infty, \epsilon_2)$;
this is a shorthand notation so that, for example,

$P_{\epsilon_1,-\infty}\{\cdot\} = \lim_{\theta_2 \to -\infty} P_{\epsilon_1,\theta_2}\{\cdot\}$.

But then

$P_{\epsilon_1,-\infty}\{d_{0,0}^c\} = P_{\epsilon_1}\{X_1 > a_1\}$

and

$P_{-\infty,\epsilon_2}\{d_{0,0}^c\} = P_{\epsilon_2}\{X_2 > a_2\}$.

So, the value of criterion (7) for the procedure D is indeed given by (12).
Similarly, the value of criterion (9) for D is also (12). Indeed, as
$\theta_i \to -\infty$, the chance that $H_i$ is incorrectly rejected tends
to 0.

To prove (i), suppose E is another decision rule satisfying (3) and (4).
Assume there exists $(x_1, x_2) \notin d_{0,0}$, but
$(x_1, x_2) \in e_{0,0}$. Then there exists at least one component with
$x_i > a_i$, say $x_1 > a_1$. Hence,

$P_{\epsilon_1,-\infty}\{e_{0,0}\} \ge P_{\epsilon_1,-\infty}\{X_1 < x_1, X_2 < x_2\} = P_{\epsilon_1}\{X_1 < x_1\} > P_{\epsilon_1}\{X_1 < a_1\}$.

Therefore,

$P_{\epsilon_1,-\infty}\{e_{0,0}^c\} < P_{\epsilon_1,-\infty}\{X_1 > a_1\} = P_{\epsilon_1}\{X_1 > a_1\}$,

so that E has a smaller value of criterion (7) than does the claimed
optimal D. So it must be the case that $e_{0,0} \subset d_{0,0}$. But, if
$e_{0,0}$ is strictly contained in $d_{0,0}$ such that the set difference
$e_{0,0} \,\Delta\, d_{0,0}$ has positive Lebesgue measure, then its region
for rejecting $\omega_{0,0}$, namely $e_{0,0}^c$, is bigger than
$d_{0,0}^c$, implying

$P_{0,0}\{e_{0,0}^c\} > P_{0,0}\{d_{0,0}^c\} = \alpha$.

The conclusion is that an optimal region D must have the stated region (10)
for $d_{0,0}$.

To prove (ii), let us first check that the claimed solution controls the
FWER. For $\theta \in \omega_{0,0}$,

$P_\theta\{d_{0,0}^c\} \le \alpha$

as previously argued. For $\theta \in \omega_{0,1}$,

$P_\theta\{\text{Type 1 error}\} = P_\theta\{d_{1,1} \cup d_{1,0}\} \le P_\theta\{X_1 > b_1\} \le P_0\{X_1 > b_1\} = \alpha$;

similarly for $\omega_{1,0}$.

The goal now is to find D satisfying (3), (4) and $d_{0,0}$ given by (10)
to maximize (8). Consider another rule E satisfying (3), (4) and
$e_{0,0} = d_{0,0}$. Suppose there exists $(x_1, x_2) \in e_{1,1}$ such
that $x_i < b_i$ for some i, say $i = 1$. Then

$P_{0,\infty}\{e_{1,1}\} \ge P_{0,\infty}\{X_1 > x_1, X_2 > x_2\} = P_0\{X_1 > x_1\} > P_0\{X_1 > b_1\} = \alpha$,

which would contradict strong control. So $e_{1,1} \subset d_{1,1}$. But
you cannot take away points from $d_{1,1}$ without lowering the minimum
power at $(\theta_1, \theta_2) = (\epsilon_1, \epsilon_2)$.
To prove (iii), simply observe, for any $\theta$,

$P_\theta\{\text{rejecting at least one false } H_i\} \le P_\theta\{\text{rejecting at least one } H_i\}$,

and so

$\inf_{\theta \in A_1(\epsilon)} P_\theta\{\text{rejecting at least one false } H_i\} \le \inf_{\theta \in A_1(\epsilon)} P_\theta\{\text{rejecting at least one } H_i\}$.

But the right-hand side is $P_{\epsilon_1}\{X_1 > a_1\}$, and so it
suffices to show that D satisfies

$\inf_{\theta \in A_1(\epsilon)} P_\theta\{D \text{ rejects at least one false } H_i\} = P_{\epsilon_1}\{X_1 > a_1\}$.

But the earlier argument for (12) showed this to be the case. □

Proof of Theorem 2.2. To prove (i), suppose E is another rule satisfying
(3) and (4) which rejects both hypotheses if $(X_1, X_2) \in e_{1,1}$.
Suppose there exists $(x_1, x_2) \in e_{1,1}$ such that $x_i < b_i$ for
some i, say $i = 1$. Then

$P_{0,\infty}\{e_{1,1}\} \ge P_{0,\infty}\{X_1 > x_1, X_2 > x_2\} = P_0\{X_1 > x_1\} > P_0\{X_1 > b_1\} = \alpha$,

which would contradict E's control of the FWER. So,
$e_{1,1} \subset d_{1,1}$. But you cannot take away any point from
$d_{1,1}$ without lowering the minimum power at
$(\epsilon_1, \epsilon_2)$.

To prove (ii), note that, for the claimed solution the value of (7) is
given by

$\inf_{\theta \in A_1(\epsilon)} P_\theta\{d_{0,0}^c\} = P_{\epsilon_1,-\infty}\{d_{0,0}^c\} = P_{\epsilon_1}\{X_1 > d_1\}$.

We now seek to determine $d_{0,0}$ [like Theorem 2.1(i) with the added
constraint that $d_{0,0} \subset d_{1,1}^c$]. To prove optimality of the
claimed solution, suppose E is another rule satisfying (3), (4) and
$e_{1,1} = d_{1,1}$, with $d_{1,1}$ given by (18). Suppose
$(x_1, x_2) \notin d_{0,0}$, but $(x_1, x_2) \in e_{0,0}$, so that
$x_i > d_i$ for some i, say $i = 1$. Then

$P_{\epsilon_1,-\infty}\{e_{0,0}\} \ge P_{\epsilon_1,-\infty}\{X_1 < x_1, X_2 < x_2\} = P_{\epsilon_1}\{X_1 < x_1\} > P_{\epsilon_1}\{X_1 < d_1\}$.

Therefore,

$P_{\epsilon_1,-\infty}\{e_{0,0}^c\} < P_{\epsilon_1}\{X_1 > d_1\}$,

so that E cannot be optimal. So it must be the case that
$e_{0,0} \subset d_{0,0}$. But if $e_{0,0}$ is strictly contained in
$d_{0,0}$, its region for rejecting $\omega_{0,0}$, namely $e_{0,0}^c$, is
bigger than $d_{0,0}^c$, in which case

$P_{0,0}\{e_{0,0}^c\} > P_{0,0}\{d_{0,0}^c\} = \alpha$,

a contradiction of strong control.

Finally, we check that D itself exhibits control of the FWER. For
$\theta \in \omega_{0,0}$, the probability of a Type 1 error is
$\le \alpha$ because of (22). For
$\theta = (\theta_1, \theta_2) \in \omega_{0,1}$,

$P_\theta\{\text{Type 1 error}\} \le P_{0,\infty}\{\{X_1 > b_1, X_2 > b_2\} \cup \{X_1 > d_1, X_2 < b_2\}\} = P_0\{X_1 > b_1\} = \alpha$,

as required.

The proof of (iii) is completely analogous to the proof of Theorem
2.1(iii). □

Proof of Lemma 3.1. To prove (i), suppose $H_1, \dots, H_p$ are true and
$H_{p+1}, \dots, H_k$ are false. A Type 1 error occurs if any of
$H_1, \dots, H_p$ are rejected. For the rule D, the set where a rejection
of any of $H_1, \dots, H_p$ occurs is a monotone increasing set, and so the
probability of this event is largest under this configuration of true and
false hypotheses when

$(\theta_1, \dots, \theta_k) = (\underbrace{0, \dots, 0}_{p\ \text{times}}, \underbrace{\infty, \dots, \infty}_{k-p\ \text{times}})$,

and this probability is equal to

$P_{\underbrace{0,\dots,0}_{p\ \text{times}}}\{X_i > f_p \text{ for some } i = 1, \dots, p\} = \alpha$

by (27) with $j = p$.

To prove (ii), note that the minimum power occurs when $\theta$ is one of
the $\binom{k}{j}$ points with j values of $\epsilon$ and $k - j$ values of
$-\infty$, such as

(40)  $w_{k,j} = w_{k,j}(\epsilon) = (\underbrace{\epsilon, \dots, \epsilon}_{j\ \text{times}}, \underbrace{-\infty, \dots, -\infty}_{k-j\ \text{times}})$.

Then, $P_{w_{k,j}}\{D_{k,j}\}$ reduces to $\beta_{k,j}(\alpha, \epsilon)$
as claimed. Also, for such a configuration $w_{k,j}$, only the j hypotheses
$H_1, \dots, H_j$ can be rejected, and so the minimum probability of
rejecting at least j hypotheses is the same as the minimum probability of
rejecting exactly j hypotheses (and it is also equal to the probability of
rejecting exactly j false hypotheses). □

Before the proof of Theorem 3.1, we need two lemmas. We will make use of
the following notation. If R is any region in $\mathbb{R}^k$, let

$R^z = \{(x_1, \dots, x_{k-1}) : (x_1, \dots, x_{k-1}, z) \in R\}$.

Lemma 6.1. Let R be any monotone rejection region in $\mathbb{R}^k$ [so
$x = (x_1, \dots, x_k) \in R$ implies $y \in R$ if $y_i \ge x_i$ for all
i].

(i) If $z_1 < z_2$, then $R^{z_1} \subset R^{z_2}$.

(ii) $R^z$, $\bigcup_z R^z$ and $\bigcap_z R^z$ are monotone rejection
regions in $\mathbb{R}^{k-1}$.

Proof. If $z_1 < z_2$ and $(x_1, \dots, x_{k-1}) \in R^{z_1}$, then
$(x_1, \dots, x_{k-1}, z_1) \in R$. By monotonicity,
$(x_1, \dots, x_{k-1}, z_2) \in R$, and so
$(x_1, \dots, x_{k-1}) \in R^{z_2}$. The proof of (ii) is just as easy. □

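Lemma 6.2. Assume the distributional assumptions given at the end of
Section 1. Let R be any monotone rejection region in $\mathbb{R}^k$. Then
for any values of the parameters $\theta_1, \dots, \theta_{k-1}$,

(41)  $P_{\theta_1,\dots,\theta_{k-1},\infty}\{R\} = P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{\bigcup_z R^z\Bigr\}$

and

(42)  $P_{\theta_1,\dots,\theta_{k-1},-\infty}\{R\} = P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{\bigcap_z R^z\Bigr\}$.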

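Proof. To prove (41),

$P_{\theta_1,\dots,\theta_{k-1},\infty}\{R\} = \lim_{\theta_k \to \infty} P_{\theta_1,\dots,\theta_k}\{(X_1, \dots, X_{k-1}) \in R^{X_k}\} \le P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{(X_1, \dots, X_{k-1}) \in \bigcup_z R^z\Bigr\}$.

Also, for every z,

$P_{\theta_1,\dots,\theta_{k-1},\infty}\{R\} \ge \lim_{\theta_k \to \infty} P_{\theta_1,\dots,\theta_k}\{(X_1, \dots, X_{k-1}) \in R^{X_k}, X_k > z\}$

$\ge P_{\theta_1,\dots,\theta_{k-1}}\{(X_1, \dots, X_{k-1}) \in R^z\}$,

where the last step uses Lemma 6.1(i) and the fact that $X_k$ tends in
probability to $\infty$ as $\theta_k \to \infty$, and so, letting
$z \to \infty$,

$P_{\theta_1,\dots,\theta_{k-1},\infty}\{R\} \ge P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{(X_1, \dots, X_{k-1}) \in \bigcup_z R^z\Bigr\}$,

and (41) follows.

To prove (42),

$P_{\theta_1,\dots,\theta_{k-1},-\infty}\{R\} = \lim_{\theta_k \to -\infty} P_{\theta_1,\dots,\theta_k}\{(X_1, \dots, X_{k-1}) \in R^{X_k}\}$

$= \lim_{\theta_k \to -\infty} P_{\theta_1,\dots,\theta_k}\{(X_1, \dots, X_{k-1}) \in R^{X_k}, X_k < z\}$

$\le P_{\theta_1,\dots,\theta_{k-1}}\{(X_1, \dots, X_{k-1}) \in R^z\}$

for every z. Let $z \to -\infty$, so that $R^z$ decreases to
$\bigcap_z R^z$. Then we can conclude

$P_{\theta_1,\dots,\theta_{k-1},-\infty}\{R\} \le P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{(X_1, \dots, X_{k-1}) \in \bigcap_z R^z\Bigr\}$.

Also,

$P_{\theta_1,\dots,\theta_{k-1},-\infty}\{R\} = \lim_{\theta_k \to -\infty} P_{\theta_1,\dots,\theta_k}\{(X_1, \dots, X_{k-1}) \in R^{X_k}\} \ge P_{\theta_1,\dots,\theta_{k-1}}\Bigl\{(X_1, \dots, X_{k-1}) \in \bigcap_z R^z\Bigr\}$,

and the result follows. □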

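Next, given a monotone rejection region $R$, define

\[ U^1(R) = \bigcup_z R^z, \qquad U^2(R) = U^1(U^1(R)) \]

and

\[ U^j(R) = U^1(U^{j-1}(R)). \]

Similarly, let

\[ I^1(R) = \bigcap_z R^z \]

and

\[ I^j(R) = I^1(I^{j-1}(R)). \]

By applying Lemma 6.2 repeatedly, we also obtain

\[ P_{\theta_1,\dots,\theta_{k-j},\underbrace{\scriptstyle \infty,\dots,\infty}_{j\ \text{times}}}\{R\}
   = P_{\theta_1,\dots,\theta_{k-j}}\{U^j(R)\} \tag{43} \]

and

\[ P_{\theta_1,\dots,\theta_{k-j},\underbrace{\scriptstyle -\infty,\dots,-\infty}_{j\ \text{times}}}\{R\}
   = P_{\theta_1,\dots,\theta_{k-j}}\{I^j(R)\}. \tag{44} \]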

Proof of Theorem 3.1. (i) Note, for any monotone rule $E$, the smallest
probability of $E_{k,j}$ over $A_j(\varepsilon)$ occurs when $\theta = w_{k,j}$ defined in (40), as
well as when $\theta$ is any permutation of $w_{k,j}$. Furthermore, for any monotone
rule $E$ that controls the FWER, we must have

\[ P_\theta\{E_{k,j}\} \le \alpha \]

when $\theta$ is

\[ v_{k,j} = (\underbrace{\infty,\dots,\infty}_{j-1\ \text{times}},\ \underbrace{0,\dots,0}_{k-j+1\ \text{times}}), \tag{45} \]

or permutations of $v_{k,j}$.

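To prove the optimality result (30), consider another rule $E$, with $E^c_{k,1}$
the subset of $\mathbb{R}^k$ that accepts all null hypotheses. Suppose there exists
$x = (x_1,\dots,x_k) \notin D^c_{k,1}$, but $x \in E^c_{k,1}$. Then there exists at least one
component of $x$, say $x_1$, with $x_1 > c_{k,1}$. By monotonicity, the set

\[ L(x) = \{y \in \mathbb{R}^k : y_i \le x_i \text{ for all } i\} \tag{46} \]

is also in $E^c_{k,1}$. Then

\[ P_{w_{k,1}}\{E^c_{k,1}\} \ge P_{w_{k,1}}\{L(x)\} = P_\varepsilon\{X_1 \le x_1\} > P_\varepsilon\{X_1 \le c_{k,1}\} = 1 - \beta_{k,1}(\alpha,\varepsilon), \]

and so the smallest power of $E$ over $A_1(\varepsilon)$ satisfies

\[ P_{w_{k,1}}\{E_{k,1}\} < \beta_{k,1}(\alpha,\varepsilon). \]

Therefore, in order for $E$ to be optimal we must have

\[ E^c_{k,1} \subset D^c_{k,1}. \]

But if $D_{k,1}$ is a proper subset of $E_{k,1}$ (except for a set with Lebesgue
measure 0), then

\[ P_{\underbrace{\scriptstyle 0,\dots,0}_{k\ \text{times}}}\{E_{k,1}\} > P_{\underbrace{\scriptstyle 0,\dots,0}_{k\ \text{times}}}\{D_{k,1}\} = \alpha, \]

a contradiction if $E$ controls the FWER. Therefore, (30) is proved.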
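
The quantities in this step can be made concrete under an extra assumption that the text does not require: independent X_i ~ N(theta_i, 1). In that special case the level condition on D_{k,1} gives c_{k,1} in closed form, and the minimum power is P_eps{X_1 >= c_{k,1}}. The sketch below uses hypothetical function names and is only meant to show the arithmetic.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal; independence and normality are assumptions


def c_k1(k: int, alpha: float) -> float:
    """Solve P_{0,...,0}{max(X_1, ..., X_k) >= c} = alpha, i.e. Phi(c)^k = 1 - alpha."""
    return STD.inv_cdf((1.0 - alpha) ** (1.0 / k))


def beta_k1(k: int, alpha: float, eps: float) -> float:
    """P_eps{X_1 >= c_{k,1}}: power at the configuration (eps, -inf, ..., -inf)."""
    return 1.0 - STD.cdf(c_k1(k, alpha) - eps)


k, alpha = 5, 0.05
c = c_k1(k, alpha)
# Level check at theta = (0, ..., 0): P{max >= c} = 1 - Phi(c)^k.
print(1.0 - STD.cdf(c) ** k)
# Minimum power for a few values of eps.
print([beta_k1(k, alpha, eps) for eps in (0.0, 1.0, 2.0, 3.0)])
```

Under this assumed model, `beta_k1(k, alpha, 0.0)` equals 1 - (1 - alpha)^(1/k), the per-hypothesis level that reappears later in the proof as the reduced level for the subproblem.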

To prove the result (31) with $j = k$, let $E$ be any other monotone decision
rule which has strong control and satisfies the constraint

\[ E_{k,k} \subset D_{k,k-1}. \]

Suppose $E_{k,k}$ includes a point $y = (y_1,\dots,y_k)$, where $y_i \ge c_{k,i}$ for $i = 1,\dots,k-1$
and $y_k < c_{k,k}$. Then

\[ P_{\infty,\dots,\infty,0}\{E_{k,k}\} \ge P_{\infty,\dots,\infty,0}\{X_1 \ge y_1,\dots,X_{k-1} \ge y_{k-1},\ X_k \ge y_k\}
   = P_0\{X_k \ge y_k\} > P_0\{X_k \ge c_{k,k}\} = \alpha, \]

a contradiction of strong control. So such a point $y$ cannot be in $E_{k,k}$, nor can
any permutation of the coordinates of $y$ (by a similar argument). Therefore,
$E_{k,k}$ can at most include $D_{k,k}$. But taking away any points from $D_{k,k}$ could
only lower the minimum power at $(\varepsilon,\dots,\varepsilon)$, and so $D_{k,k}$ is optimal.

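To prove the result (31) with $1 < j < k$, let $E$ be any other monotone
decision rule which has strong control and satisfies the constraint (32). Let

\[ X_{j:1} \ge X_{j:2} \ge \cdots \ge X_{j:j} \tag{47} \]

denote the ordered values of $X_1,\dots,X_j$. Since $E$ has strong control, it follows
by (43) that

\[ P_{\underbrace{\scriptstyle 0,\dots,0}_{k-j+1\ \text{times}}}\{U^{j-1}(E_{k,j})\} \le \alpha. \]

Hence, $U^{j-1}(E_{k,j})$ can be viewed as a rejection region in $\mathbb{R}^{k-j+1}$ for the
case with $k$ and $j$ replaced by $k' = k-j+1$ and $j' = 1$. [Note that if $E_{k,j}$
satisfies the constraint $E_{k,j} \subset D_{k,j-1}$, then

\[ U^{j-1}(E_{k,j}) \subset U^{j-1}(D_{k,j-1}) = \mathbb{R}^{k-j+1}, \]

so the constraint is vacuous.] It follows that

\[ P_{0,\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{U^{j-1}(E_{k,j})\} \le \beta_{k-j+1,1}(\alpha,0) \]

or

\[ P_{0,\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}}}\{E_{k,j}\} \le \beta_{k-j+1,1}(\alpha,0). \]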

By the same reasoning applied to any permutation of the point

\[ (0,\ \underbrace{-\infty,\dots,-\infty}_{k-j\ \text{times}},\ \underbrace{\infty,\dots,\infty}_{j-1\ \text{times}}), \]

we obtain

\[ P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}},\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{E_{k,j}\}
   = P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}}}\{I^{k-j}(E_{k,j})\} \le \beta_{k-j+1,1}(\alpha,0). \]

So $I^{k-j}(E_{k,j})$ is a rejection region in $\mathbb{R}^j$ that controls the Type 1 error at
the point

\[ (0,\ \underbrace{\infty,\dots,\infty}_{j-1\ \text{times}}) \]

(as well as at permutations of its coordinates), not at level $\alpha$, but at level
$\beta_{k-j+1,1}(\alpha,0)$. [In words, if you use the rule $E$ which is originally designed
to test $k$ hypotheses, but you ignore the last $k-j$ hypotheses, the overall
probability of a Type 1 error for testing the $j$ hypotheses is reduced to
$\beta_{k-j+1,1}(\alpha,0)$.] Also, note that the constraint $E_{k,j} \subset D_{k,j-1}$ implies

\[ I^{k-j}(E_{k,j}) \subset I^{k-j}(D_{k,j-1}) = \{X_{j:1} \ge c_{k,1},\dots,X_{j:j-1} \ge c_{k,j-1}\}. \]

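(Note that $c_{k,j}$ always refers to the critical values based on the given value
of $\alpha$, so its dependence on $\alpha$ is suppressed.) Then, by the case with $k$ and $j$
replaced by $j$ and $j$ (already proved above) and $\alpha$ replaced by $\beta_{k-j+1,1}(\alpha,0)$,
it follows that

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{I^{k-j}(E_{k,j})\} \le \beta_{j,j}(\beta_{k-j+1,1}(\alpha,0),\varepsilon) \]

or

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}},\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{E_{k,j}\} \le \beta_{j,j}(\beta_{k-j+1,1}(\alpha,0),\varepsilon). \tag{48} \]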

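We must argue that the right-hand side of (48) is $\beta_{k,j}(\alpha,\varepsilon)$. But notice that
if we apply the above reasoning to $E = D$, the inequalities are all equalities.
Indeed,

\[ U^{j-1}(D_{k,j}) = \{\text{at least one of } X_1,\dots,X_{k-j+1} \ge c_{k,j}\} \]

and the optimal minimum power (with $\varepsilon = 0$) for the subproblem with
$k' = k-j+1$, $j' = 1$ and $\alpha' = \alpha$ is $\beta_{k-j+1,1}(\alpha,0)$. Also,

\[ I^{k-j}(D_{k,j}) = \{X_{j:1} \ge c_{k,1},\dots,X_{j:j} \ge c_{k,j}\} \]

is optimal for the case $k'' = j'' = j$ at the level $\alpha'' = \beta_{k-j+1,1}(\alpha,0)$. Indeed,
checking the level condition,

\[ P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}}}\{I^{k-j}(D_{k,j})\} = P_0\{X_1 \ge c_{k,j}\}
   = P_0\{X_1 \ge c_{k-j+1,1}\} = \beta_{k-j+1,1}(\alpha,0). \]

So, by the case $k'' = j'' = j$,

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{I^{k-j}(E_{k,j})\}
   \le P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{I^{k-j}(D_{k,j})\}
   = P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{X_{j:1} \ge c_{k,1},\dots,X_{j:j} \ge c_{k,j}\} = \beta_{k,j}(\alpha,\varepsilon). \]

The proof of (ii) is completely analogous to the proof of Theorem 2.1(iii),
with the help of Lemma 3.1(ii). □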

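Proof of Lemma 4.1. To prove (i), suppose $H_1,\dots,H_p$ are true and
$H_{p+1},\dots,H_k$ are false. A Type 1 error occurs if any of $H_1,\dots,H_p$ are
rejected. For the rule $D$, the set where any of $H_1,\dots,H_p$ is rejected is a
monotone increasing set (invoking the monotonicity of critical values). Hence the
probability of this event is largest under this configuration of true and false
hypotheses when

\[ (\theta_1,\dots,\theta_k) = (\underbrace{0,\dots,0}_{p\ \text{times}},\ \underbrace{\infty,\dots,\infty}_{k-p\ \text{times}}), \]

and this probability is equal to

\[ P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{k-p\ \text{times}}}\{\text{reject any of } H_1,\dots,H_p\}
   = P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{k-p\ \text{times}}}\{\text{reject any of } H_1,\dots,H_p \,\cap\, \text{reject all of } H_{p+1},\dots,H_k\}, \tag{49} \]

because as $(\theta_1,\dots,\theta_k) \to (\underbrace{0,\dots,0}_{p\ \text{times}},\infty,\dots,\infty)$, $X_{(p+1)} > d_{p+1}$ with probability tending
to one, and so the hypotheses $H_{p+1},\dots,H_k$ are rejected with probability
tending to one. Then (49) is bounded above by

\[ P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{k-p\ \text{times}}}\{\text{at least } k-p+1 \text{ rejections}\}
   = P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{k-p\ \text{times}}}\{D_{k,k-p+1}\} \]
\[ = 1 - P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}},\underbrace{\scriptstyle \infty,\dots,\infty}_{k-p\ \text{times}}}\{X_{(1)} < d_1,\dots,X_{(p)} < d_p\}
   = 1 - P_{\underbrace{\scriptstyle 0,\dots,0}_{p\ \text{times}}}\{L_p\} = \alpha, \]

by (34) and (35).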

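To prove (ii), note that the minimum power occurs when $\theta$ is one of the
$\binom{k}{j}$ points with $j$ values of $\varepsilon$ and $k-j$ values of $-\infty$, such as $w_{k,j}$ given by
(40). Then

\[ P_{w_{k,j}}(D_{k,j}) = 1 - P_{w_{k,j}}\{X_{(1)} < d_1,\dots,X_{(k-j+1)} < d_{k-j+1}\}
   = 1 - P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\bigl\{\{X_1 < d_{k-j+1}\} \cup \cdots \cup \{X_j < d_{k-j+1}\}\bigr\}, \]

which reduces to $\beta_{k,j}(\alpha,\varepsilon)$ as claimed. □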

Proof of Theorem 4.1. To prove (37) (the case $j = k$), first observe
that

\[ D_{k,k} = \{X_{(1)} \ge d_1\}. \]

Consider another monotone rule $E$, and suppose there exists some point
$x = (x_1,\dots,x_k)$ with $x \in E_{k,k}$ but $x \notin D_{k,k}$. Then there exists at least one
component of $x$, say $x_1$, with $x_1 < d_1$. By monotonicity the set

\[ M(x) = \{y \in \mathbb{R}^k : y_i \ge x_i \text{ for all } i\} \]

is also in $E_{k,k}$. Then

\[ P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{k-1\ \text{times}}}\{E_{k,k}\} \ge P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{k-1\ \text{times}}}\{M(x)\}
   = P_0\{X_1 \ge x_1\} > P_0\{X_1 \ge d_1\} = \alpha, \]

which would contradict strong control. So we must have $E_{k,k} \subset D_{k,k}$. But
then

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{k\ \text{times}}}\{E_{k,k}\} \le P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{k\ \text{times}}}\{D_{k,k}\}, \]

and so (37) is proved.
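
The contradiction used for the case j = k can be illustrated numerically. The sketch below assumes independent X_i ~ N(theta_i, 1), which is an assumption made here for illustration only; the function names are hypothetical. It computes d_1 from P_0{X_1 >= d_1} = alpha, shows that an upper set M(x) anchored at any x_1 < d_1 exceeds level alpha at (0, inf, ..., inf), and evaluates the probability of D_{k,k} = {min X_i >= d_1} at (eps, ..., eps).

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal; independence and normality are assumptions


def d1(alpha: float) -> float:
    """Critical value with P_0{X_1 >= d_1} = alpha."""
    return STD.inv_cdf(1.0 - alpha)


def level_of_upper_set(x1: float) -> float:
    """P_{0,inf,...,inf}{M(x)} = P_0{X_1 >= x1}; the other coordinates are at +inf."""
    return 1.0 - STD.cdf(x1)


def prob_all_rejected(k: int, alpha: float, eps: float) -> float:
    """P_{eps,...,eps}{min(X_1, ..., X_k) >= d_1} under independence."""
    return (1.0 - STD.cdf(d1(alpha) - eps)) ** k


alpha = 0.05
print(level_of_upper_set(d1(alpha)))        # equals alpha at the boundary
print(level_of_upper_set(d1(alpha) - 0.1))  # exceeds alpha once x1 < d_1
print([prob_all_rejected(3, alpha, eps) for eps in (0.0, 1.0, 2.0, 3.0)])
```

Any point with a coordinate below d_1 therefore cannot belong to a region with strong control, which is the inclusion E_{k,k} in D_{k,k} used in the proof.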

To prove the result (38) in the case $j = 1$, the constraint is that $E_{k,1}$ must
contain $D_{k,2}$, or, equivalently,

\[ E^c_{k,1} \subset \bigcap_{i=1}^{k-1} \{X_{(i)} < d_i\}. \]

Suppose $x = (x_1,\dots,x_k) \in E^c_{k,1}$ but $x \notin D^c_{k,1}$. For the sake of argument,
assume the $x_i$ are nondecreasing in $i$ with $x_i < d_i$ for $i = 1,\dots,k-1$ (so
the constraint is satisfied), but $x_k > d_k$. Then $x \in E^c_{k,1}$ implies $L(x) \subset E^c_{k,1}$,
where $L(x)$ is defined in (46). So

\[ P_{\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-1\ \text{times}},\varepsilon}\{E^c_{k,1}\} \ge P_{\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-1\ \text{times}},\varepsilon}\{L(x)\}
   = P_\varepsilon\{X_k \le x_k\} > P_\varepsilon\{X_k < d_k\}. \]

Therefore

\[ P_{\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-1\ \text{times}},\varepsilon}\{E_{k,1}\} < P_\varepsilon\{X_k \ge d_k\} = \beta_{k,1}(\alpha,\varepsilon), \]

and so $E_{k,1}$ is less powerful than $D_{k,1}$. Therefore such a point $x$ cannot
exist in order for $E_{k,1}$ to be optimal. (A similar argument applies to any
permutation of the coordinates of $x$.) Then $x \in D_{k,1}$ implies $x \in E_{k,1}$. But
adding any points $x$ to $D_{k,1}$ would increase the probability of rejection when
$\theta = (0,\dots,0)$, and this would contradict the level constraint. So the case
$j = 1$ is proved.

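To prove (38) for $1 < j < k$, let $E$ be any other monotone decision rule
which has strong control and satisfies the constraint (39). Since the set $E_{k,j}$
cannot have probability greater than $\alpha$ when $\theta = v_{k,j}$, where $v_{k,j}$ is given by
(45), we must have

\[ \alpha \ge P_{v_{k,j}}\{E_{k,j}\} = P_{\underbrace{\scriptstyle 0,\dots,0}_{k-j+1\ \text{times}}}\{U^{j-1}(E_{k,j})\} \]

by (43). Therefore $U^{j-1}(E_{k,j})$ is a region in $\mathbb{R}^{k-j+1}$ which has rejection
probability at most $\alpha$ when $\theta = (0,\dots,0)$. Note that the constraint $E_{k,j} \supset D_{k,j+1}$
implies the region $U^{j-1}(E_{k,j})$ must contain

\[ U^{j-1}(D_{k,j+1}). \]

Therefore, by the case considered above with $k' = k-j+1$ and $j' = 1$, the
optimal region in $\mathbb{R}^{k-j+1}$ is

\[ \{X_{k-j+1:k-j+1} \ge d_1\} \cup \cdots \cup \{X_{k-j+1:1} \ge d_{k-j+1}\}, \]

which, in fact, is equal to $U^{j-1}(D_{k,j})$. So

\[ P_{0,\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{U^{j-1}(E_{k,j})\} \le P_{0,\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{U^{j-1}(D_{k,j})\}
   = P_0\{X_1 \ge d_{k-j+1}\} = \beta_{k-j+1,1}(\alpha,0). \]

Using (43) and applying the argument to any permutation of $v_{k,j}$, we have

\[ P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}},\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{E_{k,j}\} \le \beta_{k-j+1,1}(\alpha,0), \]

or by (44),

\[ P_{0,\underbrace{\scriptstyle \infty,\dots,\infty}_{j-1\ \text{times}}}\{I^{k-j}(E_{k,j})\} \le \beta_{k-j+1,1}(\alpha,0). \]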

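So $I^{k-j}(E_{k,j})$ is a rejection region in $\mathbb{R}^j$ that controls the Type 1 error at

\[ (0,\ \underbrace{\infty,\dots,\infty}_{j-1\ \text{times}}) \]

(as well as permutations of its coordinates), not at level $\alpha$, but at level
$\beta_{k-j+1,1}(\alpha,0)$. [Also note that the constraint $E_{k,j} \supset D_{k,j+1}$ implies

\[ I^{k-j}(E_{k,j}) \supset I^{k-j}(D_{k,j+1}) = \emptyset, \]

which is always satisfied.] By the case with $k'' = j'' = j$ and $\alpha'' = \beta_{k-j+1,1}(\alpha,0)$
considered above,

\[ I^{k-j}(D_{k,j}) = \{\min(X_1,\dots,X_j) \ge d_{k-j+1}\} \]

is optimal for this case and so

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{I^{k-j}(E_{k,j})\} \le P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}}}\{I^{k-j}(D_{k,j})\} = \beta_{k,j}(\alpha,\varepsilon) \]

by Lemma 4.1(ii). Therefore

\[ P_{\underbrace{\scriptstyle \varepsilon,\dots,\varepsilon}_{j\ \text{times}},\underbrace{\scriptstyle -\infty,\dots,-\infty}_{k-j\ \text{times}}}\{E_{k,j}\} \le \beta_{k,j}(\alpha,\varepsilon), \]

as was to be proved.

The proof of (ii) is analogous to the proof of Theorem 3.1(ii). □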